Using Search Results to Microaggregate Query Logs Semantically
Authors
Abstract
Query log anonymization has become an important challenge. A query log contains the search history of users, together with the results they selected and the position of those results in the ranking. These data are used for personalized re-ranking of results and for trend studies. However, query logs can disclose sensitive information about users. Hence, query logs must undergo an anonymization process that guarantees that: a) no sensitive information can be linked to an identity; b) analysis of the anonymized data produces results similar to those obtained from the original data, i.e. data distortion is minimized. Recent anonymization approaches use microaggregation, a statistical disclosure control technique that provides privacy comparable to k-anonymity while attempting to minimize data distortion. We propose a new method that uses search results to optimize microaggregation, providing more data reliability than existing methods.
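As a rough illustration of the microaggregation idea mentioned in the abstract, the sketch below implements the classical univariate fixed-size variant: records are sorted, split into groups of at least k, and each value is replaced by its group mean, so every published value is shared by at least k records. This is not the authors' method, which relies on search results to compare queries semantically; the numeric values and the function name `microaggregate` are assumptions made purely for illustration.

```python
from statistics import mean


def microaggregate(values, k):
    """Replace each value by the mean of a group of at least k similar values.

    Hypothetical helper for illustration only: generic univariate
    fixed-size microaggregation, not the method proposed in the paper.
    """
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k records")

    # Sort record indices by value so that neighbours are similar.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    result = [0.0] * len(values)

    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder (size between k and 2k - 1).
        end = (g + 1) * k if g < n_groups - 1 else len(values)
        group = order[start:end]
        centroid = mean(values[i] for i in group)
        for i in group:
            result[i] = centroid

    return result


if __name__ == "__main__":
    # Five records with k = 2: groups {1, 2} and {10, 11, 12}.
    print(microaggregate([1, 2, 10, 11, 12], 2))
```

Larger k gives stronger privacy but moves each record further from its original value, which is the distortion trade-off the abstract refers to.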
Similar resources
Extracting Semantically Related Queries By Exploiting User Session Information
This paper presents a simple and very effective collaborative approach to generate semantically related queries to a user query by employing aggregated user session statistics, as captured by search engine query logs. We show empirical evidence that one of the main causes of the temporal correlation between semantically related queries, which was previously reported in the literature, is the fa...
Why Not Use Query Logs As Corpora?
Generally, every Web search engine logs the user sessions. These records, called query logs, contain valuable information about the behaviour of Internet users and their language. There are only a few experiments on mining query logs, but they confirm that query logs are very useful for designing natural language applications in Web retrieval. This paper shows how lexical and semantic informati...
Mining Search Subtopics from Query Logs
Web queries are usually short and ambiguous. Subtopic mining plays an important role in understanding users' search intent and has attracted many researchers' attention. In this paper, we describe our approach to identifying users' intents from query logs, which is a subtopic mining subtask of the NTCIR-9 Intent task for Chinese. We extract queries that are semantically related to the original que...
Analysis of users’ query reformulation behavior in Web with regard to Wholistic/analytic cognitive styles, Web experience, and search task type
Background and Aim: The basic aim of the present study is to investigate users’ query reformulation behavior with regard to wholistic-analytic cognitive styles, search task type, and experience in using the Web. Method: This study is applied research using a survey method. A total of 321 search queries were submitted by 44 users. Data collection tools were Riding’s Cognitive Style A...
Query Log Mining in Search Engines
The Web is a huge read-write information space where many items such as documents, images or other multimedia can be accessed. In this context, several information technologies have been developed to help users satisfy their search needs on the Web, and the most widely used are search engines. Search engines allow users to find Web resources by formulating queries (a set of terms) and reviewing a l...